A Survey of Multilingual Text Retrieval
Abstract
This report reviews the present state of the art in the selection of texts in one language based on queries in another, a problem we refer to as "multilingual text retrieval." Present applications of multilingual text retrieval systems are limited by the cost and complexity of developing and using the multilingual thesauri on which they are based, and by the level of user training that is required to achieve satisfactory search effectiveness. A general model for multilingual text retrieval is used to review the development of the field and to describe modern production and experimental systems. The report concludes with some observations on the present state of the art and an extensive bibliography of the technical literature on multilingual text retrieval.
The research reported herein was supported in part by Army Research Office contract DAAL C through Battelle Corporation, NSF NYI IRI, Alfred P. Sloan Research Fellow Award BR, a General Research Board Semester Award, and the Logos Corporation.
Similar resources
A multilingual text mining approach to web cross-lingual text retrieval
To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our approach will first discover the multilingual concept–term relationships from linguistically diverse textual data relevant to a domain. Second, the multilingual concept–term relationships, in turn, are used to discover the conceptual content of the multilingual text, which is either a document contai...
A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps
With the increasing amount of multilingual text on the Internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method, which applies the growing hierarchical self-organizing map (GHSOM) model, to discover knowledge from multilingual text docu...
Discovering Parallel Text from the World Wide Web
A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including cross-lingual text retrieval, multilingual computational linguistics and multilingual text mining. Constructing a parallel corpus requires effective alignment of parallel documents. In this paper, we develop a parallel page identification system for identifying and aligning parallel documents ...
Mining bilingual topic hierarchies from unaligned text
Recent years have seen an exponential growth in the amount of multilingual text available on the web. This situation raises the need for novel applications for organizing and accessing multilingual content. Common examples of such applications include Multilingual Topic Tracking, Cross-Language Information retrieval systems etc. Most of these applications rely on the availability of multilingua...
Publication date: 1996